Designing a reduced feature-vector set for speech recognition by using KL/GPD competitive training

Authors

  • Tsuneo Nitta
  • Akinori Kawamura
Abstract

The hybrid algorithm of SMQ (Statistical Matrix Quantization) and HMM shows high performance in vocabulary-unspecific, speaker-independent speech recognition; however, it requires a large amount of computation and memory at the segment quantizer stage of SMQ. In this paper, we propose a newly developed two-stage segment quantizer, consisting of a feature extractor based on KL expansion and a classifier, that can be trained using KL/GPD competitive training. Experimental results show a 1/30 to 1/40 reduction in both computation time and memory size, with the same performance as the earlier version of SMQ.
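
As a rough illustration of the feature-extraction idea only, the sketch below computes a KL expansion (equivalent to principal component analysis) of a set of feature vectors and projects a vector onto the leading basis vectors to obtain a reduced feature vector. It is not the authors' quantizer: the GPD competitive training and the classifier stage are not shown, the function names (`kl_basis`, `project`, and the helpers) are hypothetical, and power iteration with deflation is simply one convenient way to obtain the leading eigenvectors without external libraries.

```python
import math
import random


def mean_vector(samples):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(len(samples[0]))]


def covariance(samples, mean):
    """Covariance matrix (divided by the sample count) as nested lists."""
    n, dim = len(samples), len(mean)
    cov = [[0.0] * dim for _ in range(dim)]
    for s in samples:
        d = [s[i] - mean[i] for i in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += d[i] * d[j] / n
    return cov


def top_eigenvector(matrix, iters=500, seed=0):
    """Power iteration: dominant eigenvalue and unit eigenvector."""
    rng = random.Random(seed)
    dim = len(matrix)
    v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(dim)) for i in range(dim)]
    eigenvalue = sum(v[i] * mv[i] for i in range(dim))
    return eigenvalue, v


def kl_basis(samples, k):
    """Return the sample mean and the k leading KL basis vectors."""
    mean = mean_vector(samples)
    cov = covariance(samples, mean)
    basis = []
    for _ in range(k):
        lam, v = top_eigenvector(cov)
        basis.append(v)
        # Deflation: remove the component just found from the matrix.
        for i in range(len(v)):
            for j in range(len(v)):
                cov[i][j] -= lam * v[i] * v[j]
    return mean, basis


def project(x, mean, basis):
    """Reduced feature vector: coordinates of x - mean in the KL basis."""
    return [sum((x[i] - mean[i]) * b[i] for i in range(len(x))) for b in basis]


# Toy data (made up for illustration): most variance lies along the first axis.
toy_samples = [(3.0, 0.0), (-3.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
toy_mean, toy_basis = kl_basis(toy_samples, 2)
print(project((3.0, 0.0), toy_mean, toy_basis))
```

The sign of each basis vector is arbitrary, so projected coordinates are determined only up to sign. In practice a library eigensolver would normally replace the power iteration used here.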

Similar resources

Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms

One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification by using feature selection and feature-space reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...

A new method for robust speech recognition based on missing data using a bidirectional neural network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the use of missing-feature methods. In this approach, the components in the time-frequency representation of the signal (spectrogram) that present a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Audio-visual speech recognition using MCE-based HMMs and model-dependent stream weights

This paper presents a framework for designing a hidden Markov model (HMM)-based audio-visual automatic speech recognition (ASR) system based on minimum classification error training. Audio/visual HMM parameters are optimized with the generalized probabilistic descent (GPD) method, and their likelihoods are combined using model-dependent stream weights which are also estimated with the GPD metho...

Improving the performance of MFCC for Persian robust speech recognition

The Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...

Publication date: 1997